
Conversation

Meghagaur

Test PR, not intended for merge: a quick verification of the code-server build on Konflux; checks the Dockerfile and platform-specific pylock/pyproject changes.


openshift-ci bot commented Oct 8, 2025

Hi @Meghagaur. Thanks for your PR.

I'm waiting for a red-hat-data-services member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@Meghagaur
Author

/build-konflux

@Nash-123

Nash-123 commented Oct 8, 2025

/ok-to-test

@Nash-123

Nash-123 commented Oct 8, 2025

/build-konflux


openshift-ci bot commented Oct 8, 2025

@Meghagaur: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/images | c50cb43 | link | true | /test images |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@Meghagaur
Author

/build-konflux

1 similar comment
@Meghagaur
Author

/build-konflux

@Meghagaur Meghagaur changed the base branch from main to rhoai-3.0 October 9, 2025 09:32
@Nash-123

Nash-123 commented Oct 9, 2025

/build-konflux

@Meghagaur Meghagaur force-pushed the s390x-codeserver-me branch from c50cb43 to f6091a7 Compare October 9, 2025 17:08

openshift-ci bot commented Oct 9, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign daniellutz for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Nash-123

/build-konflux

1 similar comment
@Nash-123

/build-konflux

@Meghagaur
Author

/build-konflux

@jiridanek
Member

/build-konflux

jiridanek and others added 4 commits October 15, 2025 08:14
…rm64" (opendatahub-io#2574)

This reverts commit c8ff00a

Originally added in opendatahub-io#1396 because of
Pipenv limitations that are no longer present in uv.
…pipeline

The error message you provided, `(error: exit status 1; output: write /opt/app-root/lib/python3.12/site-packages/nvidia/nccl/lib/libnccl.so.2: no space left on device)`, indicates a **disk space limitation** encountered during the container build process, specifically while writing an Nvidia-related Python package file.

This issue, commonly reported as `No space left on device`, generally occurs when the build pipeline attempts to write more data than is available in a shared volume or local ephemeral storage.

Here is a detailed analysis and potential solutions based on the source material, particularly those dealing with large container images and multi-platform builds:

### Root Cause and General Solution

1.  **Shared Volume Overflow:** The error likely means your build pipeline is consuming too much space in a shared volume, typically the workspace declared in your `PipelineRun` YAML. The default Tekton workspace size is often small (e.g., 1GB).
2.  **Solution: Increase Workspace Storage:** The standard recommendation is to **request more disk space** by increasing the storage value within the `.spec.workspaces` section of your relevant `PipelineRun` files:
    *   For example, one user solved a `prefetch-dependencies` task failure by increasing storage to 2Gi.
    *   However, for very large builds (like those involving AI/Nvidia libraries, as your error suggests), you may need significantly more space. One user noted that large images may require **2 to 3 times the actual file size** during building and tagging.
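A minimal sketch of that change, assuming a typical Tekton/Konflux `PipelineRun` layout; the file path, workspace name, and sizes here are illustrative, not taken from this repository:

```yaml
# .tekton/<component>-push.yaml (illustrative path)
spec:
  workspaces:
    - name: workspace
      volumeClaimTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 5Gi   # bumped from the 1Gi default; large AI images may need far more
```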

### Context specific to Large/AI/Multi-Arch Builds

Your specific error involving `/opt/app-root/lib/python3.12/site-packages/nvidia/nccl/lib/libnccl.so.2` places this failure within the context of building large containers with extensive dependencies, as often seen in builds from the RHEL AI/AIPCC teams.

*   **Large Image Size:** Tasks involving machine learning or large model images (sometimes referred to as "modelcar" images) frequently face this issue because the artifacts being built are very large. For instance, certain AI models require ephemeral storage volumes that can exceed 200Gi.
*   **Aarch64/ARM64 Architecture:** In known instances of this exact type of error (involving unpacking Nvidia/vllm dependencies), the failure consistently occurred on **aarch64/arm64 builds**, while x86_64 builds passed. Default disk size for these nodes may be limited (e.g., 40 GB).
*   **Failure Point:** The failure you see is occurring during the process of **copying layers and metadata for the container**, likely during the unpacking or committing phase, pointing to the local ephemeral storage running out of space.

### Platform-Specific Workaround

Since the issue seems related to the underlying machine size, especially if you are targeting AArch64 (ARM64), a suggested fix is to utilize a remote platform with guaranteed larger disk space:

*   **Override Platform:** You can try replacing the default build platform in your `PipelineRun` configuration with a larger machine type. For **arm64** builds experiencing this failure, a recommendation was made to use `linux-d160-m2xlarge/arm64`, which provides **160 GB of disk space**.
*   **Configuring `buildah-remote`:** If you are using a remote build task (like `buildah-remote`), you would need to specify this larger platform flavor in the configuration.

If increasing the default workspace size is insufficient, addressing the underlying node size for large builds is crucial.
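The platform override could look like the following in the `PipelineRun` parameters. This is a sketch: the parameter name `build-platforms` and the flavor strings are assumptions based on the discussion above, so verify them against your actual pipeline definition:

```yaml
params:
  - name: build-platforms
    value:
      - linux/x86_64
      - linux-d160-m2xlarge/arm64   # 160 GB disk instead of the default arm64 node size
```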
…age components and new build-platform entries for specific components
…io#2568)

* Enabled TrustyAI Notebook for s390x

Signed-off-by: Nishan Acharya <[email protected]>

* Address Comments

Signed-off-by: Nishan Acharya <[email protected]>

* Removed EPEL from output stage

Signed-off-by: Nishan Acharya <[email protected]>

* Add s390x label for konflux

Signed-off-by: Nishan Acharya <[email protected]>

---------

Signed-off-by: Nishan Acharya <[email protected]>
atheo89 and others added 8 commits October 15, 2025 10:40
…io#2575)

* s390x changes for codeserver

* Update get_code_server_rpm.sh

Fix conditional syntax for the architecture guard.
The missing space before "$ARCH" turns the token into `||"$ARCH"`, so bash complains with "conditional binary operator expected" and exits before any build logic runs.

As suggested by coderabbitai.

* Update devel_env_setup.sh

Changed to a proper `&&` chain for the dnf commands.

* add files via lfs

---------
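The operator-spacing bug described in the `get_code_server_rpm.sh` fix can be reproduced in isolation. The function name and messages below are hypothetical, not the actual script contents; only the spacing rule is the point:

```shell
#!/usr/bin/env bash
set -euo pipefail

check_arch() {
  local arch="$1"
  # Broken form (note the missing space): [[ "$arch" != "x86_64" ||"$arch" != "s390x" ]]
  # bash lexes ||"$arch" as a single token and aborts with
  # "conditional binary operator expected" before any build logic runs.
  # Fixed form, with the operator properly space-separated:
  if [[ "$arch" == "x86_64" || "$arch" == "s390x" ]]; then
    echo "supported: $arch"
  else
    echo "unsupported: $arch"
  fi
}

check_arch "s390x"      # prints "supported: s390x"
check_arch "riscv64"    # prints "unsupported: riscv64"
```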

Co-authored-by: aryabjena <[email protected]>
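Why the `&&` chain for dnf matters can be shown with plain shell; the dnf commands themselves are elided here, with `false` standing in for a failing install step:

```shell
# With ';' (or separate lines), a failed command does not stop the chain,
# and the overall exit status reflects only the last command:
( false ; echo "still ran" )
# With '&&', the chain short-circuits and the failure propagates:
( false && echo "never runs" ) || echo "chain failed"
```

In a Dockerfile `RUN` step this is the difference between a broken `dnf install` being silently masked by a later `dnf clean all` and the build failing fast at the real error.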
Add BASE_IMAGE as a buildArg for the build configs and remove the aipcc bases, as we got unauthorized-access errors
Switch from aipcc to UBI/RHEL images for rstudio
@Meghagaur Meghagaur force-pushed the s390x-codeserver-me branch from a91156b to 9101b77 Compare October 15, 2025 11:02